Improving false discovery rate estimation
Authors
Abstract
MOTIVATION: Recent attempts to account for multiple testing in the analysis of microarray data have focused on controlling the false discovery rate (FDR). However, rigorous control of the FDR at a preselected level is often impractical. Consequently, it has been suggested to use the q-value as an estimate of the proportion of false discoveries among a set of significant findings. However, such an interpretation of the q-value may be unwarranted considering that the q-value is based on an unstable estimator of the positive FDR (pFDR). Another method proposes estimating the FDR by modeling p-values as arising from a beta-uniform mixture (BUM) distribution. Unfortunately, the BUM approach is reliable only in settings where the assumed model accurately represents the actual distribution of p-values.
METHODS: A method called the spacings LOESS histogram (SPLOSH) is proposed for estimating the conditional FDR (cFDR), the expected proportion of false positives conditioned on having k 'significant' findings. SPLOSH is designed to be more stable than the q-value and applicable in a wider variety of settings than BUM.
RESULTS: In a simulation study and data analysis example, SPLOSH exhibits the desired characteristics relative to the q-value and BUM.
AVAILABILITY: The Web site www.stjuderesearch.org/statistics/splosh.html has links to freely available S-plus code to implement the proposed procedure.
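As a rough illustration of what "the proportion of false discoveries among k significant findings" means in practice, the following Python sketch computes a generic plug-in estimate for each k. It is not SPLOSH, the q-value procedure, or BUM as published; the function names `fdr_estimates` and `storey_pi0`, the default tuning value `lam=0.5`, and the choice of Python are assumptions made only for this example.

```python
def fdr_estimates(pvalues, pi0=1.0):
    """Plug-in FDR estimate for rejecting the k smallest p-values, k = 1..m.

    pi0 is an assumed proportion of true null hypotheses; pi0 = 1.0 gives
    the most conservative version of the estimate.
    """
    m = len(pvalues)
    ordered = sorted(pvalues)
    # Under the null, about pi0 * m * p_(k) p-values are expected to fall
    # at or below p_(k); dividing by k gives the estimated false proportion.
    return [min(1.0, pi0 * m * p / k) for k, p in enumerate(ordered, start=1)]


def storey_pi0(pvalues, lam=0.5):
    """Simple estimate of the null proportion from p-values above lam.

    Hypothetical helper: p-values above lam are assumed to come mostly
    from true nulls, which are uniform on (0, 1).
    """
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / ((1.0 - lam) * m))
```

A call such as `fdr_estimates(pvals, pi0=storey_pi0(pvals))` returns one estimate per value of k. These raw values are not smoothed or forced to be monotone in k, so they can fluctuate sharply from one k to the next, particularly for small k.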
Similar resources
The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data
Background and Objectives: In recent years, new technologies have led to the production of large amounts of data, and in the field of biology, microarray technology has also developed dramatically. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and also to detect the differentially expressed genes. In this study, the false discovery rate was investiga...
Estimation of False Discovery Rate Using Permutation P-Values with Different Discrete Null Distributions
The false discovery rate (FDR) is a multiple testing error rate which describes the expected proportion of type I errors among the total number of rejected hypotheses. Benjamini and Hochberg introduced this quantity and provided an estimator that is conservative when the number of true null hypotheses, m0, is smaller than the number of tests, m. Replacing m with m0 in Benjamini and Hoc...
Local false discovery rate estimation using feature reliability in LC/MS metabolomics data
False discovery rate (FDR) control is an important tool of statistical inference in feature selection. In mass spectrometry-based metabolomics data, features can be measured at different levels of reliability and false features are often detected in untargeted metabolite profiling as chemical and/or bioinformatics noise. The traditional false discovery rate methods treat all features equally, w...
Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions.
Multiple comparison procedures that control a family-wise error rate or false discovery rate provide an achieved error rate as the adjusted p-value or q-value for each hypothesis tested. However, since achieved error rates are not understood as probabilities that the null hypotheses are true, empirical Bayes methods have been employed to estimate such posterior probabilities, called local false...
On "Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates": Does a large number of tests obviate confidence intervals of the FDR?
A previously proved theorem gives sufficient conditions for an estimator of the false discovery rate (FDR) to conservatively converge to the FDR with probability 1 as the number of hypothesis tests increases, even for small sample sizes. It does not follow that several thousand tests ensure that the estimator has moderate variance under those conditions. In fact, they can hold even if the test ...
Two-stage stepup procedures controlling FDR
A two-stage stepup procedure is defined and an explicit formula for the FDR of this procedure is derived under any distributional setting. Sets of critical values are determined that provide a control of the FDR of a two-stage stepup procedure under iid mixture model. A class of two-stage FDR procedures modifying the Benjamini–Hochberg (BH) procedure and containing the one given in Storey et al...
Journal title:
- Bioinformatics
Volume 20, Issue 11
Pages -
Publication date: 2004